Abstract: In the present paper we study distance models for the analysis of three-way
contingency tables. Specifically, we will study three-way association under these models
measured by the second order odds ratio. Two kinds of distance models will be studied:
(a) Models for three-way tables where each way is treated on an equal footing; (b) Models
for multiple two-way tables, where one of the three ways has a special importance. For
the first kind of models, called triadic distance models, we will show that there exists a
natural conjugacy between the exponential-p similarity function, the $L_p$-transform and
the Minkowski-p distance. For triadic distance models defined by the $L_p$-transform we
will prove that they do not model three-way association. Moreover, triadic distance models
defined by the $L_p$-transform are restricted multiple dyadic distances, where each dyadic
distance is defined for a two-way margin of the three-way table. Distance models for
three-way two-mode data, called three-way distance models, do succeed in modeling three-way
association.
1. Introduction


The analysis of three-way tables has received much attention in the last 
few decades. Most models for two-way data analysis can readily be adapted to 
three-way data. For example, the log-linear model for three-way tables is well 
defined. Once we get into the field of scalar products and distances, however, 
things get a little more complicated. Joly and Le Calvé (1995, p. 192), for 
example, write 


“In the two-way case, matrix analysis and the theory of vector spaces 
provide well adapted tools. Therefore, it seems natural to generalize 
the notions of scalar products and distances to three-way tables (and 
to N-way tables). Indeed, the generalization of the scalar product 
defined as a trilinear map is immediate, but unfortunately it leads to 
many difficulties because of the theoretical complexity of the tensor
product of order 3 (see Franc 1989; Denis and Dhorne, 1989).
However, the generalization of the notion of distance turns out to be 
straightforward and very useful, at least in our opinion.” 


In the present paper we will show that generalizations of distance models,
called triadic distance models, also have their limitations. Triadic distances are
distances defined on triples of points. We will focus on one specific family of
triadic distance models formed by the $L_p$-transform (see Section 3.1). These are
triadic distances defined on dyadic distances. We will show that triadic distance
models based on the $L_p$-transform do not model three-way association but only
two-way marginal association.

Other distance models for three-way data, where the distances are not 
defined on triples of points, will also be discussed. More specifically, the 
INDSCAL-model (Carroll and Chang 1970), and the model proposed by Okada 
and Imaizumi (1997) for the analysis of three-way two-mode asymmetric prox- 
imity data are studied. As we will see, these models do model three-way asso- 
ciation. 

We will focus on models for the analysis of three-way contingency tables.
More specifically, we study models for two kinds of three-way tables: (a) three-way
tables where each way is treated equally; (b) multiple two-way contingency
tables, i.e., two-way tables obtained for different cohorts, different samples, or
different time-points. In the latter kind of table, one of the three ways has a
special importance, whereas in the first kind of table each way is equally important.
For two-way contingency tables, De Rooij and Heiser (2000b) presented
distance-association models. In the present paper these distance-association models
will be generalized to three-way tables. We will discuss Minkowski distances
and unfolding distances. It is important to note that the Minkowski distances
and three-way generalizations thereof can only be applied to square tables, i.e.,
tables where each way corresponds to the same variable.

To study association in contingency tables we first need to define associ- 
ation. We follow Rudas (1998, p. 9): “... one qualitative definition of associa- 
tion is that it is the information in the joint distribution (i.e., in the contingency 
table) not contained in the marginal distributions”. A measure of association
congruent with this definition is the odds ratio. Therefore we will use the odds 
ratio, the conditional odds ratio, and the second order odds ratio (also called 
the ratio of odds ratios). An advantage of the odds ratio over other measures 
of association is its variation independence (Rudas 1998, p. 10); i.e., the odds 
ratio can vary independently from the marginals. 

The odds ratio ($\theta_{ii'jj'}$) for a two-way contingency table is defined as

$$\theta_{ii'jj'} = \frac{\pi_{ij}\,\pi_{i'j'}}{\pi_{i'j}\,\pi_{ij'}}, \qquad (1)$$

where $\pi_{ij}$ is the expected probability of an observation in cell $ij$ for
$i = 1, \ldots, I$, $j = 1, \ldots, J$ under a specified model. The conditional odds ratio
($\theta_{ii'jj'|k}$, $k = 1, \ldots, K$) for a three-way table is equal to the odds ratio given a
value on the third way. In mathematical terms,

$$\theta_{ii'jj'|k} = \frac{\pi_{ijk}\,\pi_{i'j'k}}{\pi_{i'jk}\,\pi_{ij'k}}. \qquad (2)$$

The conditional odds ratio is a measure of two-way association for a slice of the
three-way table. A measure of three-way association is the second order odds
ratio, also called ratio of odds ratios ($\theta_{ii'jj'kk'}$); i.e.,

$$\theta_{ii'jj'kk'} = \frac{\pi_{ijk}\,\pi_{i'j'k}\,\pi_{i'jk'}\,\pi_{ij'k'}}{\pi_{i'jk}\,\pi_{ij'k}\,\pi_{ijk'}\,\pi_{i'j'k'}}, \qquad (3)$$

which is equal to the ratio of conditional odds ratios for the slices $k$ and $k'$; that
is,

$$\theta_{ii'jj'kk'} = \frac{\theta_{ii'jj'|k}}{\theta_{ii'jj'|k'}}. \qquad (4)$$

There is no three-way association when the second order odds ratio is equal to
one, i.e., $\theta_{ii'jj'kk'} = 1$, or in other words, when the conditional odds ratio is
independent of $k$. Below we will often use the logarithm of the second order
odds ratio, which equals zero when there is no three-way association.
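
As a small numerical illustration of definitions (2) to (4), the following sketch computes the conditional and second order odds ratios for a 2 x 2 x 2 table. The function names and the cell probabilities are hypothetical and chosen only for illustration; they do not come from any data set discussed in this paper.

```python
def conditional_odds_ratio(p, i, i2, j, j2, k):
    """Conditional odds ratio (2) for slice k of a three-way table p[i][j][k]."""
    return (p[i][j][k] * p[i2][j2][k]) / (p[i2][j][k] * p[i][j2][k])


def second_order_odds_ratio(p, i, i2, j, j2, k, k2):
    """Second order odds ratio (4): ratio of the conditional odds ratios for k and k2."""
    return (conditional_odds_ratio(p, i, i2, j, j2, k)
            / conditional_odds_ratio(p, i, i2, j, j2, k2))


# Hypothetical cell probabilities table[i][j][k]; they sum to one.
table = [[[0.10, 0.05], [0.15, 0.10]],
         [[0.20, 0.05], [0.05, 0.30]]]

theta_k0 = conditional_odds_ratio(table, 0, 1, 0, 1, 0)      # slice k = 0
theta_k1 = conditional_odds_ratio(table, 0, 1, 0, 1, 1)      # slice k = 1
theta_3 = second_order_odds_ratio(table, 0, 1, 0, 1, 0, 1)   # three-way association
```

In this example the two conditional odds ratios differ, so the second order odds ratio differs from one and the table carries three-way association.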
2. Models for Three-way Tables and Distance Restrictions


Basic models for the analysis of contingency tables are the log-linear 
models. We will present a brief discussion of these models. Before doing so, 
we will present the same models in a multiplicative framework. We end this 
section with a discussion of the transformation of association parameters of a 
multiplicative model into distances. 


2.1. Multiplicative Models for the Cell Probabilities 


The saturated multiplicative model for a three-way contingency table can 
be written as 


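$$\pi_{ijk} = \mu\,\alpha_i^{R}\,\alpha_j^{C}\,\alpha_k^{L}\,\eta_{ij}^{RC}\,\eta_{ik}^{RL}\,\eta_{jk}^{CL}\,\eta_{ijk}^{RCL}. \qquad (5)$$
The $\alpha$ parameters denote main effects of the variables; the $\eta$ parameters denote
association between variables. The term $\mu$ is a general constant. The saturated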
model is uninteresting in itself because it always fits perfectly and does not 
give any reduction of data. To obtain more interesting models it is possible to 
restrict sets of parameters. Usual restrictions are: (a) setting some association 
parameters equal to a prespecified value (1 for no association, another constant 
for uniform association); (b) restricting a set of association parameters to be of 
reduced rank. In the present paper we will constrain the association parameters 
to fulfill the metric axioms. First however, we will show the equivalent of model 
(5) in log-linear form. 


2.2. The Log-linear Parameterization 


A more familiar parameterization of model (5) is the log-linear model,
which is obtained by taking the natural logarithm on both sides. We then obtain
$$\log(\pi_{ijk}) = \lambda + \lambda_i^{R} + \lambda_j^{C} + \lambda_k^{L} + \lambda_{ij}^{RC} + \lambda_{ik}^{RL} + \lambda_{jk}^{CL} + \lambda_{ijk}^{RCL}, \qquad (6)$$
where $\log(\mu) = \lambda$, $\log(\alpha_i^{R}) = \lambda_i^{R}$, etc. Many authors have studied these
models; we refer to Bishop, Fienberg, and Holland (1975), Fienberg (1980),
and Agresti (1990).
Here we will introduce more notation. Let us rewrite (6) as
$$\log(\pi_{ijk}) = \lambda^{M}_{ijk} + \lambda^{A}_{ijk}, \qquad (7)$$
where $\lambda^{M}_{ijk} = \lambda + \lambda_i^{R} + \lambda_j^{C} + \lambda_k^{L}$ collects the terms that will not be
transformed to distances but will be kept in the model as such, and $\lambda^{A}_{ijk}$ is the set of
(two-way and three-way) association terms that will be transformed to distances. Before every
transformation we have to define the set $\lambda^{A}_{ijk}$.
2.3. Transformation of Association Parameters


The parameters for both the log-linear model and the multiplicative model
are not always easily interpretable. Transformation of the parameters into a
distance model may enhance interpretability: a small distance corresponds to a
large association, i.e., a larger number than can be expected on the basis of the
marginal parameters (i.e., the set $\lambda^{M}_{ijk}$); a large distance corresponds to low
association, i.e., a smaller number than can be expected from the marginal
parameters. To attain this goal, the multiplicative association parameters should be
a monotone decreasing function of the distances. A family of such transformations is
given by the exponential-p similarity function; that is

$$\eta = \exp(-d^{p}), \qquad (8)$$

for $p \geq 1$, where $d$ is a distance satisfying the metric axioms. We do not use
subscripts here, because below we will apply this transformation to both two- and
three-way association parameters. Two special cases deserve attention: the
exponential decay function with $p = 1$ and the Gaussian transformation with
$p = 2$. These transformations were proposed by, among others, Shepard (1957,
1987) and Nosofsky (1985). Taking the natural logarithm on both sides, the
transformation can be written as

$$\lambda = -d^{p}. \qquad (9)$$
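
The transformation in (8) and (9) can be illustrated with a few lines of code. This is only a sketch: the function name and the example distances are assumptions made for illustration.

```python
import math


def exp_p_similarity(d, p=2):
    """Exponential-p similarity (8): maps a distance d >= 0 to eta in (0, 1]."""
    return math.exp(-d ** p)


distances = (0.0, 0.5, 1.0, 2.0)                                 # hypothetical distances
etas_decay = [exp_p_similarity(d, p=1) for d in distances]       # exponential decay
etas_gauss = [exp_p_similarity(d, p=2) for d in distances]       # Gaussian transformation

# On the log scale the transformation is lambda = -d^p, as in (9).
lambdas_gauss = [math.log(eta) for eta in etas_gauss]
```

For both values of p, the association parameter decreases as the distance grows, and a distance of zero gives the largest possible value, one.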
3. Triadic Distance Models 


In the present section we will introduce triadic distance models. Such 
models have been proposed by Hayashi (1972), Cox, Cox, and Branco (1991), 
Pan and Harris (1991), Joly and Le Calvé (1995), Daws (1996), Heiser and 
Bennani (1997), and De Rooij and Heiser (2000a). 

Joly and Le Calvé (1995) and Heiser and Bennani (1997) both gave an 
axiomatic framework for the study of triadic distances. Within these frame- 
works, a number of triadic distance models have been proposed. One family of 
such models is formed by the $L_p$-transform, which will be discussed in the next
section. De Rooij and Heiser (2000a) presented triadic distance models for the
analysis of three-way asymmetric proximities. The models are generalizations
of the $L_2$-model, in the same sense that multidimensional unfolding models are
generalizations of multidimensional scaling models (Heiser 1981, 1987). In the
remainder of this paper our discussion will be limited to the family of triadic
distance models formed by the $L_p$-transform and the generalizations proposed
by De Rooij and Heiser (2000a).
3.1. The $L_p$-transform


Triadic distances via the $L_p$-transform are based on dyadic distances. The
family of triadic distances formed by the $L_p$-transform is defined as
$$d_{ijk} = \left(d_{ij}^{p} + d_{ik}^{p} + d_{jk}^{p}\right)^{1/p}, \qquad (10)$$
for $p \geq 1$, where $d_{ijk}$ denotes a triadic distance, and $d_{ij}$ a dyadic distance.
Heiser and Bennani (1997) showed that this family of triadic distances satisfies their
axioms, if the dyadic distances satisfy the metric axioms of minimality, positivity,
symmetry, and the triangle inequality. Two specific models received attention
from Cox, Cox, and Branco (1991), Joly and Le Calvé (1995), and Heiser
and Bennani (1997), that is, the perimeter model ($p = 1$) and the generalized
Euclidean model ($p = 2$). We will focus on the generalized Euclidean model,
but our conclusions generalize to other triadic distance models defined by the
$L_p$-transform.
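
A minimal sketch of the transform in (10), applied to three dyadic distances, may help to fix ideas. The function name and the example distances (3, 4, 5) are assumptions chosen for illustration.

```python
def lp_transform(d_ij, d_ik, d_jk, p=2):
    """Triadic distance (10) built from the three dyadic distances of a triple."""
    return (d_ij ** p + d_ik ** p + d_jk ** p) ** (1.0 / p)


perimeter = lp_transform(3.0, 4.0, 5.0, p=1)     # perimeter model
gen_euclid = lp_transform(3.0, 4.0, 5.0, p=2)    # generalized Euclidean model
```

With p = 1 the triadic distance is simply the perimeter of the triangle formed by the three points; with p = 2 it is the square root of the sum of the squared dyadic distances.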

For help in interpreting the triadic distance model, and specifically
the generalized Euclidean model, we refer the reader to De Rooij and Heiser
(2000a), who discuss the generalized Euclidean model extensively, with
examples.


3.2. Natural Conjugacy between Exponential-p Similarity Function, the
$L_p$-transform and Minkowski-p Distances


The most familiar distances are the Minkowski-p distances, defined by

$$d_{ij}(X) = \Bigl(\sum_{m=1}^{M} |x_{im} - x_{jm}|^{p}\Bigr)^{1/p}, \qquad (11)$$

where $p \geq 1$ is the Minkowski parameter, X is the J × M matrix with coordinate
values, and M is the dimensionality. The Minkowski distance with $p = 1$
is known as the city-block distance; the Minkowski distance with $p = 2$ is
known as the Euclidean distance, and is probably the best known and most
used distance. Another interesting distance is the dominance metric, for which
$p = \infty$. The dominance distance equals the absolute difference between the
two points on whichever dimension they are furthest apart.

De Rooij and Heiser (2000a) noted that for $p = 1, 2$ the triadic distance
formed by the $L_p$-transform together with the Minkowski-p distance is
proportional to a natural measure of dispersion. For $p = 1$ the triadic distance is
proportional to the sum of ranges over dimensions, which in one dimension is
just the range; for $p = 2$ the triadic distance is proportional to the square root
of the inertia of the three points considered, which in one dimension reduces to
the standard deviation.
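
The proportionality to measures of dispersion can be checked numerically. The sketch below uses hypothetical coordinates for three points in two dimensions and assumes that the inertia is taken as the sum of squared deviations from the centroid; under that assumption the proportionality constants work out to 2 for p = 1 and the square root of 3 for p = 2.

```python
def minkowski(x, y, p):
    """Minkowski-p distance (11) between two coordinate vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def triadic(xi, xj, xk, p):
    """Triadic distance (10) with Minkowski-p dyadic distances."""
    pairs = ((xi, xj), (xi, xk), (xj, xk))
    return sum(minkowski(a, b, p) ** p for a, b in pairs) ** (1.0 / p)


pts = [(0.0, 1.0), (2.0, -1.0), (5.0, 3.0)]   # hypothetical coordinates of i, j, k
dims = list(zip(*pts))                         # one tuple of three values per dimension

sum_of_ranges = sum(max(c) - min(c) for c in dims)
# Inertia taken here as the sum of squared deviations from the centroid (an assumption).
inertia = sum(sum((v - sum(c) / 3) ** 2 for v in c) for c in dims)

d1 = triadic(*pts, p=1)   # perimeter model with city-block distances
d2 = triadic(*pts, p=2)   # generalized Euclidean model
```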
Here we generalize that observation by including the exponential-p
similarity function and other Minkowski metrics. Combining (8), (10), and (11) we
have

$$\begin{aligned}
-\lambda_{ijk} &= d_{ijk}^{p}(X) \\
&= d_{ij}^{p}(X) + d_{ik}^{p}(X) + d_{jk}^{p}(X) \\
&= \sum_m |x_{im} - x_{jm}|^{p} + \sum_m |x_{im} - x_{km}|^{p} + \sum_m |x_{jm} - x_{km}|^{p}.
\end{aligned} \qquad (12)$$

Using the exponential-p similarity function, with the $L_p$-transform and the
Minkowski-p distance, we obtain additivity over dimensions, which is
mathematically attractive. Special attention is needed here for the case $p = \infty$.
The $L_\infty$-transform selects the largest dyadic distance, and as noted above, the
Minkowski-$\infty$ distance is equal to the distance between the two points on
whichever dimension they are farthest apart. The triadic distance reduces to
this distance, i.e.,

$$\begin{aligned}
d_{ijk} &= \max(d_{ij}, d_{ik}, d_{jk}) \\
&= \max\Bigl(\max_m |x_{im} - x_{jm}|,\ \max_m |x_{im} - x_{km}|,\ \max_m |x_{jm} - x_{km}|\Bigr).
\end{aligned} \qquad (13)$$

In the remainder of this paper, we will use the conjunction of the Gaussian
transformation, the generalized Euclidean model, and Euclidean distances (all $p = 2$).
However, our conclusions generalize to other values of $p \geq 1$.
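
The additivity over dimensions stated in (12), and the dominance case (13), can be verified numerically. The following is a sketch with hypothetical coordinates; the function names are illustrative only.

```python
INF = float("inf")


def minkowski(x, y, p):
    """Minkowski-p distance (11); p = inf gives the dominance metric."""
    if p == INF:
        return max(abs(a - b) for a, b in zip(x, y))
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def neg_lambda(xi, xj, xk, p):
    """-lambda_ijk = d_ijk^p, combining (8), (10) and (11)."""
    return sum(minkowski(a, b, p) ** p for a, b in ((xi, xj), (xi, xk), (xj, xk)))


def per_dimension(xi, xj, xk, p):
    """Last line of (12): one additive contribution per dimension m."""
    return [abs(a - b) ** p + abs(a - c) ** p + abs(b - c) ** p
            for a, b, c in zip(xi, xj, xk)]


xi, xj, xk = (0.0, 1.0), (2.0, -1.0), (5.0, 3.0)   # hypothetical coordinates
for p in (1, 2, 3):
    assert abs(neg_lambda(xi, xj, xk, p) - sum(per_dimension(xi, xj, xk, p))) < 1e-9

# Dominance case (13): the largest dyadic distance, each taken on its widest dimension.
d_inf = max(minkowski(a, b, INF) for a, b in ((xi, xj), (xi, xk), (xj, xk)))
```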


3.3. Three-way Association 


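We will transform all two- and three-way association terms of model (6)
to a triadic distance. That is, $\lambda^{A}_{ijk} = \lambda_{ij}^{RC} + \lambda_{ik}^{RL} + \lambda_{jk}^{CL} + \lambda_{ijk}^{RCL}$. Our model
then becomes

$$\begin{aligned}
\log(\pi_{ijk}) &= \lambda^{M}_{ijk} - d^{2}_{ijk} \\
&= \lambda + \lambda_i^{R} + \lambda_j^{C} + \lambda_k^{L} - d^{2}_{ijk},
\end{aligned} \qquad (14)$$

where $d_{ijk}$ is defined as in (10) with $p = 2$.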


Proposition 1: The model as defined in (14) with $d_{ijk}$ defined by the
$L_2$-transform does not model three-way association.


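Proof: The log of the second order odds ratio under this model is equal to

$$\begin{aligned}
\log(\theta_{ii'jj'kk'}) = {} & -d^{2}_{ijk} - d^{2}_{i'j'k} - d^{2}_{i'jk'} - d^{2}_{ij'k'} \\
& + d^{2}_{i'jk} + d^{2}_{ij'k} + d^{2}_{ijk'} + d^{2}_{i'j'k'}.
\end{aligned} \qquad (15)$$

Using the definition of the generalized Euclidean model to decompose the
triadic distance we obtain

$$\begin{aligned}
\log(\theta_{ii'jj'kk'}) = {} & -d^{2}_{ij} - d^{2}_{ik} - d^{2}_{jk} - d^{2}_{i'j'} - d^{2}_{i'k} - d^{2}_{j'k} \\
& - d^{2}_{i'j} - d^{2}_{i'k'} - d^{2}_{jk'} - d^{2}_{ij'} - d^{2}_{ik'} - d^{2}_{j'k'} \\
& + d^{2}_{i'j} + d^{2}_{i'k} + d^{2}_{jk} + d^{2}_{ij'} + d^{2}_{ik} + d^{2}_{j'k} \\
& + d^{2}_{ij} + d^{2}_{ik'} + d^{2}_{jk'} + d^{2}_{i'j'} + d^{2}_{i'k'} + d^{2}_{j'k'},
\end{aligned} \qquad (16)$$

from which all terms drop out; so we obtain

$$\log(\theta_{ii'jj'kk'}) = 0. \qquad (17)$$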


Conclusion: With triadic distance models we are not modeling any three-way
association, but only two-way marginal association. They do give us useful
information on the two-way marginal association, as can be seen from the
conditional odds ratio. The log of the conditional odds ratio given $k$ can be written
as

$$\begin{aligned}
\log(\theta_{ii'jj'|k}) &= -d^{2}_{ijk} - d^{2}_{i'j'k} + d^{2}_{i'jk} + d^{2}_{ij'k} \\
&= -d^{2}_{ij} - d^{2}_{i'j'} + d^{2}_{i'j} + d^{2}_{ij'}.
\end{aligned} \qquad (18)$$

The latter expression does not depend on $k$, indicating again that no three-way
association is modeled.
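
Proposition 1 can also be checked numerically. The sketch below builds log cell probabilities (up to the general constant, which drops out of every odds ratio) from model (14) with randomly drawn main effects and coordinates, and evaluates the log second order odds ratio for every combination of indices. The sizes, seed, and variable names are assumptions made for illustration.

```python
import itertools
import random

random.seed(0)
n, M = 4, 2   # hypothetical number of categories and dimensionality
X = [[random.gauss(0, 1) for _ in range(M)] for _ in range(n)]
# Main effects for the three ways; arbitrary values, since they cancel in odds ratios.
main = [[random.gauss(0, 1) for _ in range(n)] for _ in range(3)]


def d2(a, b):
    """Squared Euclidean distance between two coordinate vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def log_pi(i, j, k):
    """Model (14), with the squared generalized Euclidean triadic distance."""
    d2_ijk = d2(X[i], X[j]) + d2(X[i], X[k]) + d2(X[j], X[k])
    return main[0][i] + main[1][j] + main[2][k] - d2_ijk


def log_theta3(i, i2, j, j2, k, k2):
    """Log of the second order odds ratio (3)."""
    return (log_pi(i, j, k) + log_pi(i2, j2, k) + log_pi(i2, j, k2) + log_pi(i, j2, k2)
            - log_pi(i2, j, k) - log_pi(i, j2, k) - log_pi(i, j, k2) - log_pi(i2, j2, k2))


max_abs = max(abs(log_theta3(*idx)) for idx in itertools.product(range(n), repeat=6))
```

Up to floating-point rounding, `max_abs` is zero for any choice of coordinates and main effects, in line with (17).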


3.4. Triadic versus Dyadic Models 


Until now we did not distinguish between Minkowski distances and
unfolding distances. Here we will make that distinction. Minkowski distances are
based on one coordinate matrix X and have been defined earlier by (11). For
the Euclidean metric the distance is

$$d_{ij}(X) = \Bigl[\sum_m (x_{im} - x_{jm})^{2}\Bigr]^{1/2}. \qquad (19)$$

We will call this expression the Euclidean case of the Minkowski distance or
simply distance; it is symmetric because $d_{ij}(X) = d_{ji}(X)$. A generalization
of the Minkowski distance is the unfolding distance, based on two coordinate
matrices X and Y. The unfolding distance in the Euclidean metric is defined
as

$$d_{ij}(X; Y) = \Bigl[\sum_m (x_{im} - y_{jm})^{2}\Bigr]^{1/2}. \qquad (20)$$

We will call this quantity the unfolding distance, and when we use it to represent
association in a square table, the association is asymmetric because in general
$d_{ij}(X; Y) \neq d_{ji}(X; Y)$. The unfolding distance is equal to the corresponding
Minkowski distance if and only if X = Y.
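
The difference between (19) and (20) is easy to see in a small example. The coordinates below are hypothetical and the function names are illustrative.

```python
def euclid(a, b):
    """Euclidean distance between two coordinate vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5


X = [(0.0, 0.0), (1.0, 0.0)]   # hypothetical coordinates for the first way
Y = [(0.0, 1.0), (3.0, 0.0)]   # hypothetical coordinates for the second way


def d_minkowski(i, j, X=X):
    """Distance (19): both points come from the same coordinate matrix."""
    return euclid(X[i], X[j])


def d_unfold(i, j, X=X, Y=Y):
    """Unfolding distance (20): point i from X, point j from Y."""
    return euclid(X[i], Y[j])


d_01 = d_unfold(0, 1)          # distance from row point 0 to column point 1
d_10 = d_unfold(1, 0)          # distance from row point 1 to column point 0
d_same = d_unfold(0, 1, X, X)  # with Y = X the unfolding distance equals (19)
```

Here `d_01` and `d_10` differ, which is the asymmetry referred to above, while `d_same` coincides with the symmetric distance.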

This distinction between the Minkowski and unfolding distance can be
generalized to triadic distances, as was done by De Rooij and Heiser (2000a)
for the generalized Euclidean model. The distance model is written

$$d_{ijk}(X) = \bigl[d^{2}_{ij}(X) + d^{2}_{ik}(X) + d^{2}_{jk}(X)\bigr]^{1/2}. \qquad (21)$$

The adaptation made by De Rooij and Heiser is to replace every dyadic distance
in (21) by a dyadic unfolding distance (20). We then obtain a triadic unfolding
distance

$$d_{ijk}(X; Y; Z) = \bigl[d^{2}_{ij}(X; Y) + d^{2}_{ik}(X; Z) + d^{2}_{jk}(Y; Z)\bigr]^{1/2}, \qquad (22)$$

which reduces to the triadic distance model if and only if X = Y = Z.
Analogous to this distinction, we can define two triadic association
models, both like (14). The first will be called the Triadic Distance Association Model
(TDAM) and is obtained by using the triadic distance defined by (21):

$$\log(\pi_{ijk}) = \lambda^{M}_{ijk} - d^{2}_{ijk}(X). \qquad (23)$$

The second is called the Triadic Unfolding Association Model (TUAM) and is
obtained by using the triadic unfolding distance (22):

$$\log(\pi_{ijk}) = \lambda^{M}_{ijk} - d^{2}_{ijk}(X; Y; Z). \qquad (24)$$

For both models (23) and (24), Proposition 1 holds. Because TDAM and
TUAM do not model three-way association, a comparison to models based on
dyadic distances is of interest.

Let us return to (7), and let $\lambda^{A}_{ijk} = \lambda_{ij}^{RC} + \lambda_{ik}^{RL} + \lambda_{jk}^{CL}$. Here we set
$\lambda_{ijk}^{RCL} = 0$; that is, we do not model three-way association. Each two-way
association will be transformed separately to a dyadic distance, $\lambda_{ij}^{RC} = -d^{2}_{ij}$,
$\lambda_{ik}^{RL} = -d^{2}_{ik}$, and $\lambda_{jk}^{CL} = -d^{2}_{jk}$. Again, we can distinguish between Euclidean
Minkowski and unfolding distances ((19) and (20), respectively). First consider
the Dyadic Distance Association Model (DDAM), in which every two-way
association is transformed separately to a Euclidean distance; that is

$$\log(\pi_{ijk}) = \lambda^{M}_{ijk} - d^{2}_{ij}(X) - d^{2}_{ik}(Y) - d^{2}_{jk}(Z). \qquad (25)$$

Each association term is based on a different coordinate matrix (X, Y, and Z),
but all associations are symmetric. Secondly, consider the case in which every
association term is transformed to an unfolding distance:

$$\log(\pi_{ijk}) = \lambda^{M}_{ijk} - d^{2}_{ij}(X^{A}; X^{B}) - d^{2}_{ik}(Y^{A}; Y^{B}) - d^{2}_{jk}(Z^{A}; Z^{B}), \qquad (26)$$
where superscripts A and B are used to avoid a proliferation of matrix
symbols, but still to distinguish between the two coordinate matrices of an
unfolding distance. We will call this model the Dyadic Unfolding Association Model
(DUAM). Of course it is possible to transform some association terms with a
Euclidean Minkowski distance, and others with a Euclidean unfolding distance.
We can thus make eight (including DDAM and DUAM) different dyadic
association models, but we will not go into details here.

The triadic and dyadic unfolding association models can be used for any
three-way contingency table. As an example, consider counts $f_{ijk}$ of persons
living in a specific region $i$ (five categories: Aude, Gard, Hérault, Lozère, and
Pyrénées-Orientales), having job $j$ (nine categories: farmers, farm laborers,
self-employed professionals, higher professionals, middle professionals,
employees, workers, workers in services, and other categories) in year $k$ (four
categories: 1954, 1962, 1968, and 1975). This example is obtained from Bernard
and Lavit (1985; see also Van der Heijden, 1987, p. 97). In general we feel that
cross-classifications with many categories (more than three) are more suitable
for analysis by distance models than, for example, a 2 × 2 × 2
cross-classification. The models based on Minkowski distances can only be used in
the case where each way refers to the same variable, i.e., a three-way one-mode
input matrix. For an example of such data, see De Rooij and Heiser (2000a),
who analyzed a three-way transition table involving the political votes in
Sweden, obtained over three consecutive elections (see Upton 1978, p. 128). When
the triadic unfolding models are used for three-way one-mode tables,
they yield an asymmetric association pattern. Thus, the models based
on unfolding distances are more generally applicable than the models based on
Minkowski distances.

Comparing TDAM, TUAM, DDAM, and DUAM we find the following:
if we place symmetry restrictions on DUAM, we obtain DDAM; if we place
equality restrictions on DUAM (i.e., $X^{A} = Y^{A}$, $X^{B} = Z^{A}$, and $Y^{B} = Z^{B}$ in
(26)), DUAM reduces to TUAM; if we place symmetry restrictions on TUAM,
we obtain TDAM; if we place equality restrictions on DDAM (i.e., X = Y = Z
in (25)), we obtain TDAM. This comparison is depicted in Figure 1. We can
conclude that triadic association models are equal to dyadic association models
with equality restrictions.
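
The reduction of DUAM to TUAM under equality restrictions can be illustrated numerically. The sketch below compares only the association parts of (26) and (24); the random coordinate matrices, sizes, and function names are assumptions made for illustration.

```python
import random

random.seed(1)
n, M = 3, 2   # hypothetical number of categories and dimensionality


def rand_matrix():
    return [[random.gauss(0, 1) for _ in range(M)] for _ in range(n)]


def d2(a, b):
    """Squared Euclidean (unfolding) distance between two coordinate vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def duam_assoc(i, j, k, XA, XB, YA, YB, ZA, ZB):
    """Association part of DUAM (26): three separate squared unfolding distances."""
    return -d2(XA[i], XB[j]) - d2(YA[i], YB[k]) - d2(ZA[j], ZB[k])


def tuam_assoc(i, j, k, X, Y, Z):
    """Association part of TUAM (24): squared triadic unfolding distance (22)."""
    return -(d2(X[i], Y[j]) + d2(X[i], Z[k]) + d2(Y[j], Z[k]))


X, Y, Z = rand_matrix(), rand_matrix(), rand_matrix()
# Equality restrictions: X^A = Y^A (= X), X^B = Z^A (= Y), Y^B = Z^B (= Z).
gap = max(abs(duam_assoc(i, j, k, X, Y, X, Z, Y, Z) - tuam_assoc(i, j, k, X, Y, Z))
          for i in range(n) for j in range(n) for k in range(n))
```

The value of `gap` is zero up to rounding, so the restricted DUAM and TUAM give the same association terms for every cell.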


3.5. Conclusions, Part 1 


In this section we considered triadic distance association models. More
specifically, we covered triadic association models for contingency tables, i.e.,
multiplicative models in which association terms are transformed to triadic
distances. Transformations from multiplicative parameters to distances were
performed using the exponential-p similarity function. A natural conjugacy
was found for the exponential-p similarity function, the $L_p$-transform, and the
Minkowski-p metric.

Figure 1: Relationships between different association models through symmetry and equality
restrictions on the coordinate matrices.

Using one of the conjunctions ($p = 2$), we found that triadic association
models do not model three-way association according to the second order odds
ratio. Again, please recall that this conclusion can be generalized to other values
of $p \geq 1$. Further expanding this result, we found that triadic association models
are equal to dyadic association models with equality restrictions.

Let us return for the moment to the definition of the triadic unfolding
distance (22):

$$\begin{aligned}
d^{2}_{ijk}(X; Y; Z) &= d^{2}_{ij}(X; Y) + d^{2}_{ik}(X; Z) + d^{2}_{jk}(Y; Z) \\
&= 2\sum_m \bigl(x_{im}^{2} + y_{jm}^{2} + z_{km}^{2} - x_{im}y_{jm} - x_{im}z_{km} - y_{jm}z_{km}\bigr).
\end{aligned} \qquad (27)$$

This distance formulation does not contain a trilinear term ($x_{im}y_{jm}z_{km}$). One
way to model three-way association is to include such a trilinear term in the
definition of a triadic distance. Two problems then remain, however: (a) the
problems noted by Franc (1989) and by Denis and Dhorne (1989) as mentioned
in the Introduction; (b) the graphical representation of such a triadic distance.
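
The expansion in (27) can be confirmed with a short numerical check. The coordinate vectors below are random and purely illustrative.

```python
import random

random.seed(2)
M = 3   # hypothetical dimensionality
# Coordinate vectors of point i in X, point j in Y, and point k in Z.
x, y, z = ([random.gauss(0, 1) for _ in range(M)] for _ in range(3))

# First line of (27): sum of the three squared unfolding distances.
lhs = (sum((a - b) ** 2 for a, b in zip(x, y))
       + sum((a - c) ** 2 for a, c in zip(x, z))
       + sum((b - c) ** 2 for b, c in zip(y, z)))

# Second line of (27): only squares and pairwise (bilinear) products appear.
rhs = 2 * sum(a * a + b * b + c * c - a * b - a * c - b * c
              for a, b, c in zip(x, y, z))
```

Both sides agree up to rounding, and the right-hand side visibly contains no product of three coordinates.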


4. Three-way Distance Models 


Having discussed triadic distance models, we should also analyze other 
distance models for three-way data. To distinguish between the models dis- 
cussed in the previous section and here, we will denote the distance models in 
the previous section by triadic distances, and the models in the present sec- 
tion as three-way distance models. For three-way two-mode data there are 
many distance models like the INDSCAL model (Carroll and Chang 1970), 
the IDIOSCAL model (Carroll and Chang 1972; Carroll and Wish 1974), the 
three-mode scaling procedure of Tucker (1972), and a model for three-way 
city-block scaling by Heiser (1989). For an overview of three-way scaling 
models see Arabie, Carroll, and DeSarbo (1987). We would like to discuss 
the INDSCAL model (Carroll and Chang, 1970) and the model devised by 
Okada and Imaizumi (1997) for asymmetric proximity data. These two mod- 
els lead to other generalizations of the distance-association models discussed 
by De Rooij and Heiser (2000b). These three-way distance models are differ- 
ent from the triadic distance models: no distance is specified between three 
points, but dyadic distances are specified, which are weighted for the third way. 
Triadic distance models are most useful for three-way one-mode data, or three- 
way three-mode data. The INDSCAL model and Okada and Imaizumi’s (1997) 
model are most useful for three-way two-mode data, that is, for example, oc- 
cupational mobility data at different points in time, or in different countries. 
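To study three-way association, we will apply the Gaussian transformation to
$\lambda^{A}_{ijk} = \lambda_{ij}^{RC} + \lambda_{ik}^{RL} + \lambda_{jk}^{CL} + \lambda_{ijk}^{RCL}$; that is

$$\log(\pi_{ijk}) = \lambda + \lambda_i^{R} + \lambda_j^{C} + \lambda_k^{L} - d^{2}_{ijk}. \qquad (28)$$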


We start our discussion with the INDSCAL model, and afterwards we discuss 
Okada and Imaizumi’s model. 


4.1. The INDSCAL Model 


The INDSCAL model (Carroll and Chang, 1970) in squared form is written
as

$$d^{2}_{ijk} = \sum_m w_{km}(x_{im} - x_{jm})^{2}. \qquad (29)$$




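Consider now (14) with $d_{ijk}$ defined by (29). Developing the second order odds
ratio (3) we obtain

$$\begin{aligned}
\log(\theta_{ii'jj'kk'}) = {} & -\sum_m \bigl(w_{km}x_{im}^{2} + w_{km}x_{jm}^{2} - 2w_{km}x_{im}x_{jm}\bigr) \\
& -\sum_m \bigl(w_{k'm}x_{i'm}^{2} + w_{k'm}x_{jm}^{2} - 2w_{k'm}x_{i'm}x_{jm}\bigr) \\
& -\sum_m \bigl(w_{k'm}x_{im}^{2} + w_{k'm}x_{j'm}^{2} - 2w_{k'm}x_{im}x_{j'm}\bigr) \\
& -\sum_m \bigl(w_{km}x_{i'm}^{2} + w_{km}x_{j'm}^{2} - 2w_{km}x_{i'm}x_{j'm}\bigr) \\
& +\sum_m \bigl(w_{k'm}x_{im}^{2} + w_{k'm}x_{jm}^{2} - 2w_{k'm}x_{im}x_{jm}\bigr) \\
& +\sum_m \bigl(w_{km}x_{i'm}^{2} + w_{km}x_{jm}^{2} - 2w_{km}x_{i'm}x_{jm}\bigr) \\
& +\sum_m \bigl(w_{km}x_{im}^{2} + w_{km}x_{j'm}^{2} - 2w_{km}x_{im}x_{j'm}\bigr) \\
& +\sum_m \bigl(w_{k'm}x_{i'm}^{2} + w_{k'm}x_{j'm}^{2} - 2w_{k'm}x_{i'm}x_{j'm}\bigr),
\end{aligned} \qquad (30)$$

which can be rewritten as

$$\begin{aligned}
\log(\theta_{ii'jj'kk'}) &= 2\sum_m (w_{km} - w_{k'm}) \times \bigl(x_{im}x_{jm} + x_{i'm}x_{j'm} - x_{i'm}x_{jm} - x_{im}x_{j'm}\bigr) \\
&= 2\sum_m (w_{km} - w_{k'm}) \times (x_{im} - x_{i'm})(x_{jm} - x_{j'm}).
\end{aligned} \qquad (31)$$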
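The log of the second order odds ratio equals zero if, for all $m$, either
$w_{km} = w_{k'm}$, or $x_{im} = x_{i'm}$, or $x_{jm} = x_{j'm}$. In one dimension this condition is
also necessary; with more than one dimension the terms of (31) could in principle
also cancel across dimensions.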
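
The closed form (31) can be checked against the direct sum and difference of the eight squared INDSCAL distances. The coordinates and weights in this sketch are hypothetical values chosen so that the arithmetic is easy to follow.

```python
import itertools

X = [[0.0, 1.0], [2.0, -1.0], [5.0, 3.0]]   # hypothetical coordinates x_im
W = [[1.0, 1.0], [2.0, 0.5], [0.5, 3.0]]    # hypothetical dimension weights w_km
n, K, M = len(X), len(W), len(X[0])


def d2(i, j, k):
    """Squared INDSCAL distance (29)."""
    return sum(W[k][m] * (X[i][m] - X[j][m]) ** 2 for m in range(M))


def log_theta3_direct(i, i2, j, j2, k, k2):
    """Log second order odds ratio from the eight squared distances, as in (15)."""
    return (-d2(i, j, k) - d2(i2, j2, k) - d2(i2, j, k2) - d2(i, j2, k2)
            + d2(i2, j, k) + d2(i, j2, k) + d2(i, j, k2) + d2(i2, j2, k2))


def log_theta3_closed(i, i2, j, j2, k, k2):
    """Closed form (31)."""
    return 2 * sum((W[k][m] - W[k2][m]) * (X[i][m] - X[i2][m]) * (X[j][m] - X[j2][m])
                   for m in range(M))


index_sets = itertools.product(range(n), range(n), range(n), range(n), range(K), range(K))
gap = max(abs(log_theta3_direct(*t) - log_theta3_closed(*t)) for t in index_sets)
example = log_theta3_closed(0, 1, 0, 2, 0, 1)
```

For these values the log second order odds ratio for i = 0, i' = 1, j = 0, j' = 2, k = 0, k' = 1 is clearly nonzero, in contrast to the triadic distance models of Section 3.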
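The log of the conditional odds ratio, given a source $k$, is

$$\begin{aligned}
\log(\theta_{ii'jj'|k}) &= \sum_m \bigl[w_{km}(x_{i'm} - x_{jm})^{2} + w_{km}(x_{im} - x_{j'm})^{2} \\
&\qquad\quad - w_{km}(x_{im} - x_{jm})^{2} - w_{km}(x_{i'm} - x_{j'm})^{2}\bigr] \\
&= 2\sum_m w_{km}\bigl(x_{im}x_{jm} + x_{i'm}x_{j'm} - x_{i'm}x_{jm} - x_{im}x_{j'm}\bigr) \\
&= 2\sum_m w_{km}(x_{im} - x_{i'm})(x_{jm} - x_{j'm}).
\end{aligned} \qquad (32)$$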
This form is the same as derived by De Rooij and Heiser (2000b) for one two- 
way table (i.e., only one source) when we set the weight for each source k 
on each dimension equal to one (i.e., equal dimension weights). If we define 
$y^{k}_{im} = w_{km}^{1/2}x_{im}$ and collect the $y^{k}_{im}$ in a matrix $Y^{k}$, the log of the
conditional odds ratio for slice $k$ is given by

$$d^{2}_{i'j}(Y^{k}) + d^{2}_{ij'}(Y^{k}) - d^{2}_{ij}(Y^{k}) - d^{2}_{i'j'}(Y^{k}). \qquad (33)$$


So, if we make a plot of the coordinates for slice $k$, we can obtain the
conditional log odds ratio by the sum and difference of some squared distances. The
conditional odds ratio given some $i$ is computationally more tedious, but after
some mathematical reformulation we obtain

$$\log(\theta_{jj'kk'|i}) = \sum_m (w_{km} - w_{k'm}) \times \bigl[(x_{im} - x_{j'm})^{2} - (x_{im} - x_{jm})^{2}\bigr]. \qquad (34)$$

Both conditional odds ratios are dependent on the index of the given slice. 
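
The reading of the conditional log odds ratio from a plot of the rescaled coordinates, as in (33), can be sketched as follows. The coordinates and weights are hypothetical, and the function names are illustrative.

```python
X = [[0.0, 1.0], [2.0, -1.0], [5.0, 3.0]]   # hypothetical coordinates x_im
W = [[1.0, 1.0], [2.0, 0.5]]                # hypothetical dimension weights w_km


def slice_coords(k):
    """Configuration for slice k, with entries sqrt(w_km) * x_im."""
    return [[w ** 0.5 * v for w, v in zip(W[k], row)] for row in X]


def d2_plain(a, b):
    """Unweighted squared Euclidean distance, as read from the plot."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def log_cond_from_plot(i, i2, j, j2, k):
    """Expression (33): sum and difference of squared distances in the slice plot."""
    Y = slice_coords(k)
    return (d2_plain(Y[i2], Y[j]) + d2_plain(Y[i], Y[j2])
            - d2_plain(Y[i], Y[j]) - d2_plain(Y[i2], Y[j2]))


def log_cond_closed(i, i2, j, j2, k):
    """Closed form (32)."""
    return 2 * sum(w * (X[i][m] - X[i2][m]) * (X[j][m] - X[j2][m])
                   for m, w in enumerate(W[k]))


v_plot = log_cond_from_plot(0, 1, 0, 2, 1)
v_closed = log_cond_closed(0, 1, 0, 2, 1)
v_other_slice = log_cond_closed(0, 1, 0, 2, 0)
```

The two routes give the same value for slice k = 1, and the value for slice k = 0 is different, which illustrates the dependence on the slice.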
4.2. The Okada and Imaizumi Model 


The model of Okada and Imaizumi (1997) is defined as 


1/2 
ae 
dijk = tijk “a et: x Ti 
Sn (225) 
1/2 
ts. 
t 
+ a are R K Tj (35) 
Dee 
where

$$t_{ijk} = w_k d_{ij} = w_k\Bigl(\sum_m (x_{im} - x_{jm})^{2}\Bigr)^{1/2}. \qquad (36)$$


The model uses two kinds of weights to model differences on the third mode $k$:
symmetry weights $w_k$ and asymmetry weights $u_{km}$. For a detailed discussion
of the model, fitting the model to data, and interpreting the model, we refer to
Okada and Imaizumi (1997). Taking the square, the model can be written as


2 242 
de = wedi; + ee 


wed; ie 
+2wgdij eos x (ri Es r;) 
ij 


wd? 
= (< 5) X ri X Tj, (37) 


n 
yi
a Ez ) 


2 
5 Cr SHI he a 
sae Se al ear J 4.9 Te | (38) 
i G V in 51k! 
The log of this second order odds ratio equals zero if and only if either
$w_k = w_{k'}$ and $u_{km} = u_{k'm}$ for all $m$, or all distances $d_{ij}$, $d_{i'j}$, $d_{ij'}$, and
$d_{i'j'}$ are zero. This latter condition is very rare, however. If the asymmetry
weights are not equal, the log of the second order odds ratio equals zero if
$w_k = w_{k'} = 0$. The model proposed by Okada and Imaizumi obviously is
mathematically rather complex, as is necessary to give a nice representation of
symmetry and asymmetry by distances in Euclidean space.

The conditional odds ratio, for a source $k$, is given by the first four lines of
(38). Both expressions for the second order odds ratio and the conditional odds
ratio are rather complex. In fact, the expressions are just the sum and differences
(as in (15) and the first line of (18)) of the original distance definitions, with no
reduction but just a reshuffling of the terms. Because the expressions cannot
be simplified, we can again plot a configuration for each slice. The conditional
odds ratio, given a slice $k$, is then given by the sum and differences of the
squared distances in this plot, just as for the INDSCAL model (see Equation
33).
4.3. Conclusions, Part 2


The three-way distance-association models discussed in this section
successfully model three-way association, in contrast to the triadic
distance-association models in Section 3. Therefore, the models have an advantage in
situations where one is specifically interested in this three-way association. Both
distance models discussed in this section include a trilinear term. The model
of Okada and Imaizumi is a generalization of the INDSCAL model, including
parts for the skew-symmetric part of the data. Setting all radii ($r_i$) equal to each
other, we obtain the INDSCAL model with equal dimension weights for every
slice $k$.


5. Discussion 


We studied three-way association measured by the second order odds ra- 
tio under distance models for three-way data. An interesting negative result was 
found: triadic distance association models do not model three-way association, 
but only two-way marginal association. Three-way distance association models 
do model three-way association. The three-way distance models both contain 
trilinear terms, but the triadic distance models do not. If a model includes trilin- 
ear terms, we are sure that three-way association is modeled. We feel, however, 
that trilinear terms are not the only way to include three-way information in the 
model. For example, the log-linear model does not have trilinear terms, but still 
is capable of modeling three-way association. 

Triadic distance models and three-way distance models are intended for
the analysis of different kinds of data. The triadic distance association models
are best suited to three-way three-mode data or three-way one-mode data;
the three-way distance association models discussed in Section 4
are best suited to multiple two-way data matrices, i.e., three-way
two-mode data. An advantage of the triadic distance model over the three-way
distance model is that it represents each of the three two-way margins by
distances, whereas the three-way distance models give a distance
representation of only one of the three two-way margins.

Because the triadic distance models do not model three-way association 
but only marginal two-way association, we should not use these models when 
we are especially interested in three-way association. The triadic distance mod- 
els can give useful information about the structure in the data, however. Exam- 
ples are given in the papers by Heiser and Bennani (1997) and De Rooij and 
Heiser (2000a). We do not want to argue here that triadic distances should not 
be used. They definitely can be useful, but one should recognize the properties 
of such models. 
The study of triadic distances should be continued, until triadic distance
models are found that (a) model three-way association; (b) have a clear graph- 
ical representation; and (c) overcome the mathematical difficulties in fitting 
them. 